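Matching Groups

Syntax	Function
|	Matches either the expression on its left or the one on its right
(ab)	Treats the characters inside the parentheses as a group
\num	Refers back to the string matched by group number num
`(?P<name>)`	Gives the group a name (alias)
`(?P=name)`	Refers back to the string matched by the group named name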
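
The constructs in the table can be tried out one at a time. The following is a minimal sketch, not part of the original exercises: the variable names (`alt`, `pair`, `tag`, `named`) and the sample strings are chosen here for illustration only.

```python
import re

# Alternation: the bar matches the expression on either side.
alt = re.match(r'apple|pear', 'pear')
print(alt.group())  # pear

# Numbered groups: group(0) is the whole match; group(1), group(2), ...
# follow the order of the opening parentheses from left to right.
pair = re.match(r'(qq):(\d+)', 'qq:10567')
print(pair.group(0), pair.group(1), pair.group(2))  # qq:10567 qq 10567

# Numbered backreference: \1 must repeat exactly what group 1 captured.
tag = re.match(r'<(\w+)>.*</\1>', '<html>hh</html>')
print(tag.group(1))  # html

# Named group and named backreference: same idea, addressed by name.
named = re.match(r'<(?P<tag>\w+)>.*</(?P=tag)>', '<h1>hi</h1>')
print(named.group('tag'))  # h1

# A mismatched closing tag fails because the backreference cannot match.
mismatch = re.match(r'<(\w+)>.*</\1>', '<html>hh</h1>')
print(mismatch)  # None
```

The patterns are written as raw strings (`r'...'`) so that `\1` and `\d` reach the regex engine unchanged instead of being interpreted as Python string escapes.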

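Matching Exercises

Requirement: from the list ['apple', 'banana', 'orange', 'pear'], match apple and pear
Requirement: match email addresses from 163, 126, qq, and similar providers
Requirement: match data such as qq:10567 and extract the text qq and the QQ number
Requirement: match <html>hh</html>
Requirement: match <html><h1>www.google.com</h1></html>
Requirement: match <html><h1>www.google.com</h1></html> using named groups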

Implementation:

import re

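# | matches either the expression on its left or the one on its right
# (ab) treats the characters inside the parentheses as a group
# \num refers back to the string matched by group number num
# (?P<name>) gives the group a name (alias)
# (?P=name) refers back to the string matched by the group named name

# 1. Requirement: from the list ['apple', 'banana', 'orange', 'pear'], match apple and pear
# 2. Requirement: match email addresses from 163, 126, qq, and similar providers
# 3. Requirement: match data such as qq:10567 and extract the text qq and the QQ number
# 4. Requirement: match <html>hh</html>
# 5. Requirement: match <html><h1>www.itcast.cn</h1></html>
# 6. Requirement: match <html><h1>www.itcast.cn</h1></html> using named groups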

# 1. Requirement: from the list ['apple', 'banana', 'orange', 'pear'], match apple and pear
# | matches either the expression on its left or the one on its right
# fruit = ['apple', 'banana', 'orange', 'pear']
#
# for value in fruit:
#     result = re.match('apple|pear', value)
#     # Check whether the match succeeded
#     if result:
#         info = result.group()
#         print(f'A fruit I want to eat: {info}')
#     else:
#         print(f'{value} is not a fruit I want to eat')

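# 2. Requirement: match email addresses from 163, 126, qq, and similar providers
# | matches either the expression on its left or the one on its right
# (ab) treats the characters inside the parentheses as a group
# \ is the escape character, so \. matches a literal dot
# result = re.match(r'[a-zA-Z0-9_]{4,20}@(163|126|qq)\.com', '[email protected]')
# if result:
#     info = result.group()
#     print(info)
# else:
#     print('No match')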

# 3. Requirement: match data such as qq:10567 and extract the text qq and the QQ number
# group(0) is the whole match, group(1) is the first group, group(2) is the second; groups are numbered from left to right
# result = re.match(r'(qq):([1-9]\d{4,11})', 'qq:10567')
# if result:
#     info = result.group()
#     print(info)  # qq:10567
#
#     num = result.group(2)
#     print(num)  # 10567
#
#     prefix = result.group(1)
#     print(prefix)  # qq
# else:
#     print('No match')

# 4. Requirement: match <html>hh</html>
# \num refers back to the string matched by group number num
# result = re.match(r'<([a-zA-Z1-6]{4})>.*</\1>', '<html>hhh</html>')  # \1 is the content of the first group
# if result:
#     info = result.group()
#     print(info)
# else:
#     print('No match')

# 5. Requirement: match <html><h1>www.itcast.cn</h1></html>
# \2 is the content of the second group (h1); \1 is the content of the first group (html)
# result = re.match(r'<([a-zA-Z1-6]{4})><([a-zA-Z1-6]{2})>.*</\2></\1>', '<html><h1>www.itcast.cn</h1></html>')
# if result:
#     info = result.group()
#     print(info)
# else:
#     print('No match')

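# 6. Requirement: match <html><h1>www.itcast.cn</h1></html> using named groups
# (?P<name>) gives the group a name (alias)
# (?P=name) refers back to the string matched by the group named name
# Here (?P=h1) and (?P=html) refer back to the groups named h1 and html
result = re.match(r'<(?P<html>[a-zA-Z1-6]{4})><(?P<h1>[a-zA-Z1-6]{2})>.*</(?P=h1)></(?P=html)>', '<html><h1>www.itcast.cn</h1></html>')
if result:
    info = result.group()
    print(info)
else:
    print('No match')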
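
Named groups can also be read back by name after a successful match, which the exercise above does not show. The sketch below reuses the exercise 6 pattern; the constant `PATTERN` and the helper `outer_and_inner_tags` are hypothetical names introduced here for illustration.

```python
import re

# Same pattern as exercise 6, written as a raw string.
PATTERN = (
    r'<(?P<html>[a-zA-Z1-6]{4})><(?P<h1>[a-zA-Z1-6]{2})>'
    r'.*</(?P=h1)></(?P=html)>'
)


def outer_and_inner_tags(text):
    """Return (outer, inner) tag names, or None when the tags do not pair up."""
    result = re.match(PATTERN, text)
    if result is None:
        return None
    # group('name') reads a named group; groupdict() would return all of them.
    return result.group('html'), result.group('h1')


print(outer_and_inner_tags('<html><h1>www.itcast.cn</h1></html>'))  # ('html', 'h1')
print(outer_and_inner_tags('<html><h1>www.itcast.cn</h2></html>'))  # None
```

The second call returns `None` because the closing `</h2>` does not repeat what the group named `h1` captured, so the named backreference fails.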